In the notebook, import plotly and enable direct plotting in the current notebook.
import plotly.offline as py
py.init_notebook_mode(connected=True)
Import the pycoQC main class.
from pycoQC.pycoQC import pycoQC
The pycoQC repository contains 6 example sequencing summary files generated with various versions of Albacore. Each of those files contains only 10,000 reads.
Larger versions of these files are also available from https://www.ebi.ac.uk/~aleg/data/pycoQC_test/
pycoQC is a simple class that is initialized with a sequencing_summary file generated by ONT Albacore.
The instantiated object can subsequently be called with various methods that generate tables and plots.
There are a few different ways to get help for all the public package functions:
?pycoQC.channels_activity
help(pycoQC.channels_activity)
shift + tab
All the plots are generated with the offline version of plotly for Python.
All the plotting methods return a plotly Figure object that can be used for further customization or for export in various formats.
In addition, users can customize the figures online in a user-friendly environment by clicking on "Edit in Chart Studio" in the upper right corner of each figure.

Similarly, static pictures can be exported using the "Download plot as a png" button.

Upon initialization, pycoQC reads the sequencing summary file, runs a series of tests, and pre-processes the data for the plotting methods.
pycoQC can read compressed sequencing_summary.txt files (‘gzip’, ‘bz2’, ‘zip’, ‘xz’) and can load a summary file directly from a URL.
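For illustration only, here is one way extension-based decompression can be sketched with the Python standard library. This is not pycoQC's own loading code, which is not shown here; the helper names `open_summary` and `read_header` and the extension mapping are assumptions, and zip archives and URLs are left out to keep the sketch short.

```python
import bz2
import gzip
import lzma

# Hypothetical mapping from file extension to a text-mode opener
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_summary(path):
    """Open a possibly compressed tab-separated summary file in text mode."""
    for ext, opener in OPENERS.items():
        if path.endswith(ext):
            return opener(path, "rt")
    # No known compression extension: treat as plain text
    return open(path, "r")


def read_header(path):
    """Return the column names found on the first line of the file."""
    with open_summary(path) as handle:
        return handle.readline().rstrip("\n").split("\t")
```

Calling `read_header` on one of the example files in ./data is a quick way to check which fields a given Albacore version wrote.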
Depending on the run type and the version of Albacore used, some information might not be available. In particular, calibration reads were not flagged in earlier versions of Albacore. When the field is available, those reads are automatically discarded. Similarly, barcode information is only available in multiplexed runs.
The type of run (1D or 1D2) is automatically detected but can be explicitly enforced with run_type if needed.
Several run IDs are often present in a single sequencing_summary file. Unfortunately, there is no way to know the correct order based on the information contained in the sequencing_summary.txt file alone. By default, pycoQC automatically reorders the runs by decreasing throughput, which should normally reflect the sequencing order. However, if you know the order, you can specify it at initialisation with the option runid_list. This option can also be used to select specific run IDs.
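The default reordering can be pictured with the following sketch. It assumes that "throughput" means the total number of sequenced bases per run ID and that the relevant columns are named `run_id` and `sequence_length_template`; the function name `order_runids` and the toy records are made up for illustration, and pycoQC's internal logic may differ.

```python
from collections import defaultdict


def order_runids(reads):
    """Return run IDs sorted by decreasing total number of bases."""
    bases_per_run = defaultdict(int)
    for read in reads:
        bases_per_run[read["run_id"]] += read["sequence_length_template"]
    return sorted(bases_per_run, key=bases_per_run.get, reverse=True)


# Toy records standing in for rows of a sequencing summary file
toy_reads = [
    {"run_id": "run_b", "sequence_length_template": 500},
    {"run_id": "run_a", "sequence_length_template": 1200},
    {"run_id": "run_b", "sequence_length_template": 300},
    {"run_id": "run_a", "sequence_length_template": 900},
]
runid_order = order_runids(toy_reads)
```

A list in this form is the kind of value the runid_list option takes when the true sequencing order is known.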
By default, pycoQC assumes that the minimum mean quality for a "pass" read is 7 (the same as the default Albacore value). To adjust the value, specify it at initialisation with min_pass_qual.
help(pycoQC.__init__)
p = pycoQC("https://www.ebi.ac.uk/~aleg/data/pycoQC_test/Albacore-2.3.1_basecall-1D-RNA_sequencing_summary.txt.gz", verbose=True, min_pass_qual=10)
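To make the effect of the pass threshold concrete, the sketch below splits reads into pass and fail from their mean quality score. It assumes the score column is `mean_qscore_template` and that a read passes when its score is greater than or equal to the threshold; whether pycoQC uses a strict or an inclusive comparison is not stated here, and `split_pass_fail` is a hypothetical helper.

```python
def split_pass_fail(reads, min_pass_qual=7):
    """Partition reads into (passed, failed) by mean quality score."""
    passed, failed = [], []
    for read in reads:
        if read["mean_qscore_template"] >= min_pass_qual:
            passed.append(read)
        else:
            failed.append(read)
    return passed, failed


# Toy reads with made-up quality scores
toy_reads = [
    {"read_id": "r1", "mean_qscore_template": 6.5},
    {"read_id": "r2", "mean_qscore_template": 8.2},
    {"read_id": "r3", "mean_qscore_template": 11.0},
]
default_pass, default_fail = split_pass_fail(toy_reads)
strict_pass, strict_fail = split_pass_fail(toy_reads, min_pass_qual=10)
```

Raising the threshold from 7 to 10 moves read r2 from the pass set to the fail set, which is the kind of shift to expect in the "pass reads" view of the plots.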
The summary method generates a simple summary table with a clickable button to switch from "all reads" to "pass reads" only.
help(pycoQC.summary)
p = pycoQC("./data/Albacore-1.2.3_basecall-1D-RNA_small_sequencing_summary.txt.gz")
fig = p.summary()
pycoQC has 3 methods to visualize the distribution of mean quality scores and of estimated read length:
reads_len_1D: A distribution histogram of estimated read length in logarithmic scale
reads_qual_1D: A distribution histogram of mean quality scores
reads_len_qual_2D: A density contour plot of estimated read length vs mean quality scores in semilog scale
Although we recommend sticking to the default values, all 3 methods allow users to customize the plots:
nbins for the 1D plots and len_nbins / qual_nbins for the 2D plot
color/colorscale, width and height
help(pycoQC.reads_len_1D)
p = pycoQC("./data/Albacore-2.1.10_basecall-1D-RNA_small_sequencing_summary.txt.gz")
fig = p.reads_len_1D()
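To give an idea of what nbins controls on a logarithmic length axis, the sketch below counts read lengths in bins whose edges are spaced evenly in log10. It is a standard-library illustration with made-up lengths; `log_histogram` is a hypothetical helper and does not reproduce how pycoQC computes its histogram.

```python
import math


def log_histogram(lengths, nbins=4):
    """Count read lengths in nbins bins spaced evenly on a log10 scale.

    Assumes all lengths are strictly positive.
    """
    lo, hi = math.log10(min(lengths)), math.log10(max(lengths))
    # Fall back to a unit step when all lengths are identical
    step = (hi - lo) / nbins or 1.0
    edges = [10 ** (lo + i * step) for i in range(nbins + 1)]
    counts = [0] * nbins
    for length in lengths:
        # Clamp so that the longest read falls in the last bin
        idx = min(int((math.log10(length) - lo) / step), nbins - 1)
        counts[idx] += 1
    return edges, counts


# Made-up read lengths spanning four orders of magnitude
toy_lengths = [10, 50, 200, 900, 1500, 20000, 100000]
edges, counts = log_histogram(toy_lengths, nbins=4)
```

With more bins the same reads are spread over narrower length intervals, which shows finer structure but makes sparse regions noisier.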
help(pycoQC.reads_qual_1D)
p = pycoQC("./data/Albacore-2.1.10_basecall-1D-RNA_small_sequencing_summary.txt.gz")
fig = p.reads_qual_1D()
help(pycoQC.reads_len_qual_2D)
p = pycoQC("./data/Albacore-2.1.10_basecall-1D-DNA_small_sequencing_summary.txt.gz")
fig = p.reads_len_qual_2D()
p = pycoQC("./data/Albacore-1.2.3_basecall-1D-RNA_small_sequencing_summary.txt.gz")
fig = p.output_over_time()
p = pycoQC("./data/Albacore-2.1.10_basecall-1D-DNA_small_sequencing_summary.txt.gz")
fig = p.qual_over_time()
p = pycoQC("./data/Albacore-1.2.3_basecall-1D-RNA_small_sequencing_summary.txt.gz")
fig = p.barcode_counts()
p = pycoQC("./data/Albacore-1.7.0_basecall-1D-DNA_small_sequencing_summary.txt.gz")
fig = p.channels_activity()